Playout Policy Adaptation for Games
Abstract
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We test the resulting algorithm, named Playout Policy Adaptation (PPA), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo, and Misere Nogo. For most of these games, PPA outperforms UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.
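The abstract does not spell out how the playout policy is learned, so the following is only a minimal sketch under stated assumptions: playouts sample moves from a softmax over per-move weights, and after each playout the weights of the winner's moves are raised with an NRPA-style update. The toy take-away game, the move encoding `(player, stones_left)`, the step size `alpha`, and every function name are hypothetical and chosen for illustration. The sketch also omits the UCT tree in which the playouts would normally be embedded.

```python
import math
import random


def softmax_choice(weights, moves, rng):
    """Sample one move with probability proportional to exp(weight)."""
    top = max(weights.get(m, 0.0) for m in moves)
    probs = [math.exp(weights.get(m, 0.0) - top) for m in moves]
    return rng.choices(moves, weights=probs, k=1)[0]


def playout(weights, rng, pile=10):
    """Toy game (assumption): take 1 to 3 stones, taking the last stone wins.

    A move is encoded as (player, stones_left_after_the_move).
    Returns the winner and a history of (player, legal_moves, chosen_move).
    """
    history, player = [], 0
    while pile > 0:
        moves = [(player, left) for left in range(max(0, pile - 3), pile)]
        chosen = softmax_choice(weights, moves, rng)
        history.append((player, moves, chosen))
        pile = chosen[1]
        player = 1 - player
    winner = 1 - player  # the player who just moved took the last stone
    return winner, history


def adapt(weights, winner, history, alpha=1.0):
    """Assumed NRPA-style update applied only to the winner's moves.

    The chosen move gains alpha; every legal move at that step loses
    alpha times its current softmax probability.
    """
    updated = dict(weights)
    for player, moves, chosen in history:
        if player != winner:
            continue
        top = max(weights.get(m, 0.0) for m in moves)
        exps = [math.exp(weights.get(m, 0.0) - top) for m in moves]
        z = sum(exps)
        updated[chosen] = updated.get(chosen, 0.0) + alpha
        for m, e in zip(moves, exps):
            updated[m] = updated.get(m, 0.0) - alpha * e / z
    return updated


def train(n_playouts=200, seed=0):
    """Run playouts and adapt the policy after each one."""
    rng = random.Random(seed)
    weights = {}
    for _ in range(n_playouts):
        winner, history = playout(weights, rng)
        weights = adapt(weights, winner, history)
    return weights


if __name__ == "__main__":
    learned = train()
    # Show the five highest-weighted moves for inspection.
    for move, w in sorted(learned.items(), key=lambda kv: -kv[1])[:5]:
        print(move, round(w, 2))
```

Because each update adds `alpha` to one move and subtracts a total of `alpha` across the legal moves at that step, the weights at a given step always sum to the same value; only their relative sizes change, which is what biases later playouts.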
Similar resources
Nested Rollout Policy Adaptation with Selective Policies
Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enh...
Playout policy adaptation with move features
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We also propose learning a policy based not only on the moves but also on the features of the moves. We test the resulting algorithms named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Featur...
Memorizing the Playout Policy
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state-of-the-art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF that memorizes the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT fo...
Optimization of a packet video receiver under different levels of delay jitter: an analytical approach
This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers (PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model is built around the Ek/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters, such as: the level of delay jitt...
Playout Search for Monte-Carlo Tree Search in Multi-player Games
Monte-Carlo Tree Search (MCTS) has become a popular search technique for playing multi-player games over the past few years. In this paper we propose a technique called Playout Search. This enhancement allows the use of small searches in the playout phase of MCTS in order to improve the reliability of the playouts. We investigate max, Paranoid and BRS for Playout Search and analyze their perfor...
Publication date: 2015